Abstract 

The American government set new standards mandating States to demonstrate adequate yearly 
progress for all students with the inception of the No Child Left Behind Act. To be eligible for the more 
recent Race to the Top funds, states must show, in part, a commitment to “building data systems that 
measure student growth and success, and inform teachers and principals how to improve instruction” 
(DOE, 2009, p.2). This sweeping education reform focuses on the need for formative data collection 
systems. Many states are meeting this challenge by implementing a Responsiveness to Intervention (RtI) 
model. As such, the demand for effective progress monitoring systems and powerful instructional 
interventions has never been greater. This paper illustrates how a multi-level assessment system 
aligns annual normative measures with daily frequency building practice and curriculum-based 
measurement probes within an RtI framework to drive academic outcomes. 

Responsiveness to Intervention (RtI) refers to a recent innovation in education utilizing a 
multi-tiered service delivery model with two overlapping functions: first, to identify students who are struggling 
in the classroom and remediate academic deficits, and second, to distinguish between students who are 
behind due to a history of poor instructional experiences and those in need of special education services 
for remediation of an actual learning disability (Jenkins, Hudson, & Johnson, 2007). RtI promotes a new 
focus on teaching and learning by examining how responsive students are to instruction. The term as 
originally coined, “Responsiveness,” places the agency or label of special education on the teaching 
methodologies and measures student responsiveness to those procedures. 

RtI was derived from the provisions outlined in the Individuals with Disabilities Education Improvement 
Act of 2004 (IDEA, 2004), which states that “in determining whether a child has a specific learning 
disability, a Local Education Agency may use a process that determines if the child responds to scientific, 
research-based intervention as part of the evaluation process” [Section 614 (b)(6)(B)]. As such, RtI offers 
an alternative to the traditional practice of diagnosing learning disabilities based on a pronounced dual 
discrepancy between intellectual capacity (as determined by intelligence tests) and academic proficiency 
in various subjects (as determined by achievement tests). RtI is not mandated, but IDEA 2004 now 
prohibits states from requiring this discrepancy model. 

In many ways, RtI constitutes a profound paradigm shift in the way that students with educational 
problems are perceived and taught in the classroom. According to the traditional approach, if a significant 
dual discrepancy is observed between intelligence test scores and achievement scores, the problem is 
generally considered to exist within the student. The student is then labeled with a learning disability and 
committed to the special educational system. If a significant discrepancy is not observed, the student 
returns to the general education classroom. Due to strict qualification guidelines related to the current 
provision of special education services, funding to provide additional support to students who are only 
marginally failing is not generally available. Yet it is clear that without an effective intervention, the 
deficits are only likely to increase. For this reason, the dual discrepancy model is often referred to as the 
“wait-to-fail” model and has come under increasingly widespread criticism as an ineffective and 
inadequate framework for special education (Francis et al., 2005). In contrast to the dual discrepancy 
approach, the RtI framework emphasizes identifying and supporting all students with pronounced 
academic deficits. This change in perspective on how to provide services has even led to a new term, “the 
enabled learner” (Tilly, 2006), and is creating a challenge for our school psychologists to move from the 
use of traditional psychometric tests (i.e., intelligence and achievement tests) to an “edumetric” problem 
solving model focused on measuring changes in individual performance over time (Canter, 2006). In 
summary, "RtI is a set of scientifically research-validated practices that are deployed in schools using the 
scientific method as a decision-making framework" (Tilly, 2006, p. 22). 

RtI Framework 

When a school-wide approach is adopted, the RtI framework most commonly utilizes what is 
referred to as the Standard Protocol Model (Shores & Chester, 2009). The model was based on the 
research in curriculum-based measurement (CBM) of reading skills conducted by Deno and Mirkin (Deno, 1985, 
2003; Deno & Mirkin, 1977). CBM grew out of the need for educators to access more frequent 
performance data in the academic foundation skills of reading, spelling, writing, and mathematics (Deno, 
1985; Shinn, 1989). Teachers can use these criterion-referenced assessments to compare student progress 
to a grade level standard as well as to analyze individual growth compared to previous performance. The 
Standard Protocol Model shown in Figure 1 is typically conceptualized as a pyramid or triangle with three 
tiers of intervention. 



Figure 1. RtI Standard Protocol Model 

In Tier 1, risk status may be established with the use of universal screening measures using 
benchmark scores, standardized achievement test results, or median scores from several progress 
monitoring measures (Stecker, 2007). If an academic deficit is observed, the problem is initially assumed 
to reside within the instructional environment rather than within the student. For instance, if most students 
demonstrate poor performance, then the teacher may need additional training. If the data indicate that a 
small percentage of students are not responding to a high-quality evidence-based core education program, 
then smaller group and more time-intensive intervention is provided in Tier 2. At this level, progress is 
generally monitored more frequently. Students who do not demonstrate satisfactory progress in Tier 2 
commensurate with peers become candidates for Tier 3 intervention, where even more time-intensive 
interventions are employed with even smaller group sizes (Vaughn & Linan-Thompson, 2003). Services 
can be provided by general education teachers or special education teachers. Students are ultimately 
identified as eligible for special education services when their response to effective instruction is 
significantly inferior to that of peers (Vaughn & Fuchs, 2003). More specifically, students are classified 
with a learning disability if their rate of growth based on progress monitoring data and their level of 
performance are more than 1 standard deviation below the mean level and slope of their classmates 
(Ardoin, Witt, Connell, & Koenig, 2005). Through this process it is possible to differentiate between 
students with a true learning disability and those who are under-achieving due to a history of poor 
instructional practices (Vaughn & Fuchs, 2003). 
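
The level-and-slope criterion described above can be illustrated with a minimal sketch. The class data, the student names, and the choice to treat the most recent probe as the "level" and an ordinary least-squares fit as the "slope" are all assumptions made for illustration; they are not the procedure reported by Ardoin et al. (2005).

```python
from statistics import mean, stdev

def ols_slope(scores):
    """Least-squares slope of scores against probe index 0, 1, 2, ..."""
    n = len(scores)
    x_bar = (n - 1) / 2
    y_bar = mean(scores)
    num = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(scores))
    den = sum((x - x_bar) ** 2 for x in range(n))
    return num / den

def dually_discrepant(class_scores, cutoff_sd=1.0):
    """Names whose level AND slope both fall more than cutoff_sd
    standard deviations below the class mean."""
    # Assumption: "level" is the most recent probe score.
    levels = {name: s[-1] for name, s in class_scores.items()}
    slopes = {name: ols_slope(s) for name, s in class_scores.items()}
    level_cut = mean(levels.values()) - cutoff_sd * stdev(levels.values())
    slope_cut = mean(slopes.values()) - cutoff_sd * stdev(slopes.values())
    return sorted(
        name for name in class_scores
        if levels[name] < level_cut and slopes[name] < slope_cut
    )

# Hypothetical weekly probe scores (e.g., digits correct) for one class.
class_scores = {
    "student_a": [20, 24, 28, 32],
    "student_b": [22, 25, 28, 31],
    "student_c": [18, 22, 26, 30],
    "student_d": [21, 24, 27, 30],
    "student_e": [19, 23, 27, 31],
    "student_f": [10, 10, 11, 11],
}

print(dually_discrepant(class_scores))
```

Requiring both conditions matters: a student who starts low but grows at the class rate, or who starts at the class level but stalls, would not be flagged under this rule.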

According to the Response to Intervention Adoption Survey of 2009, the RtI framework is 
becoming increasingly popular in mainstream education as an alternative to the dual discrepancy model. 
The survey was supported by the American Association of School Administrators, the Council of 
Administrators of Special Education, and the National Association of State Directors of Special Education. 
Its results showed that 71% of respondents reported either piloting RtI, implementing RtI, or being in 
the process of a district-wide RtI implementation, compared with 60% in 2008 and 44% in 
2007 (Pascopella, 2010). Yet the way in which these schools are implementing RtI remains unclear and 
likely varies widely. 

Implementation Variability 

RtI seems to hold great promise as an improved framework that by design satisfies two legally 
mandated requirements within the American educational reform movement today: first, the 
identification and remediation of students with learning deficits, and second, the district’s reporting of 
the adequate yearly progress of its students, with an emphasis on creating data systems that inform 
administrators and educators as to the progress of all students. However, RtI still lacks strong empirical 
support, especially in relation to (1) interventions that are yoked to the use of “technically sound 
instruments” as required by federal law (Kame’enui, 2007); (2) the procedures and criteria that should be 
followed to move students between tiers (Stecker, 2007); and (3) the optimal frequency of progress 
monitoring assessments used within different tiers (for discussion see Fletcher, 2006; Fuchs & Fuchs, 
2006). Until the instructional practices, decision-making processes, and the consequences of various 
assessment schedules inherent within RtI frameworks are fully developed and empirically established, the 
quality of implementations and the associated outcomes will presumably continue to vary widely. 

Therefore, the purpose of the present investigation was to broaden current research on RtI 
frameworks by analyzing the application of a multi-level system of assessment that uses Precision 
Teaching, a frequency building¹ instructional intervention designed to methodically improve student 
progress. We decided to focus our analysis on mathematics outcomes because the majority of RtI research 
studies to date have examined processes related to reading achievement. Much less research has been 
devoted to math applications, yet the prevalence of students identified as having a learning disability in 
math is similar to the incidence of those identified as having reading disabilities (Gross-Tsur, Manor, & 
Shalev, 1996). 

Using Frequency-Based Performance as a Screening Measure in Mathematics 

It is estimated that between 5% and 8% of all school-age children have a math disability (MD) as 
determined by the dual discrepancy model (Geary, 2003). A math disability typically manifests as 
problems in simple arithmetic such as number sense, number and operations, and word problem solving 
(D. P. Bryant, Bryant, & Hammill, 2000; Fuchs et al., 2004). These basic performance deficits can lead to 
long-term academic problems (Geary, 2003) and to the development of disturbing behavior (Pisecco, 
Wristers, Swank, Silva, & Baker, 2001). 

¹ As noted by Lindsley (1991), “Rate was the measure of operant behavior used in the animal laboratories 
(Ferster & Skinner, 1957; Keller & Schoenfeld, 1950; Skinner, 1938). In Precision Teaching the term 
frequency is used instead of rate because it is more readily understood by non-psychologists. Furthermore 
in one of his more general books, Skinner (1953, p. 62) himself used the term frequency when describing 
behavior” (p. 254). 

The Behavior Analyst Today, Volume 11, Number 4 

Research suggests that the absence of fluency in basic math skills limits the ability to solve more 
complex problems and understand more advanced concepts (Geary, 2003; Gersten, Jordan, & Flojo, 
2005). As such, the fluency metric can be used to uniquely discriminate between expert and novice 
performance. To illustrate this point, Fleischner, Garnett, and Shepherd (1982) compared the math fact 
computation skills of primary school students identified as having learning disabilities with those of 
average students and found that performance was essentially indistinguishable based on the measure of 
percentage correct. On timed assessments, however, students with learning disabilities completed only 
one-third as many math fact problems as their non-identified peers. On the basis of this research and 
similar findings highlighting the ability of frequency-based measures to uniquely distinguish between 
advanced and at-risk students, most universal screening and progress monitoring measures used within 
RtI frameworks are rate-based measures. 
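
The contrast between percentage correct and a rate-based measure can be made concrete with a small sketch. The two one-minute timings below are invented numbers chosen to mirror the pattern described above (equal accuracy, one-third the output); they are not data from Fleischner et al. (1982).

```python
def percent_correct(correct, attempted):
    """Accuracy as a percentage of items attempted."""
    return 100.0 * correct / attempted

def corrects_per_minute(correct, minutes):
    """Frequency (rate) of correct responding."""
    return correct / minutes

# Hypothetical one-minute math fact timings for two students.
peer = {"correct": 57, "attempted": 60, "minutes": 1}
at_risk = {"correct": 19, "attempted": 20, "minutes": 1}

for label, t in (("peer", peer), ("at-risk", at_risk)):
    acc = percent_correct(t["correct"], t["attempted"])
    rate = corrects_per_minute(t["correct"], t["minutes"])
    print(f"{label}: {acc:.0f}% correct, {rate:.0f} correct per minute")
```

Both students score 95% correct, so an accuracy-only measure cannot separate them, while the rate measure shows a threefold difference.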

The Power of Frequency-Based Instruction: Precision Teaching 

While frequency-based assessment can uniquely identify struggling students, frequency-based 
instruction can be used to drive learning outcomes. Fluent performance is defined as true mastery and 
demonstrated when an individual can perform a task smoothly, accurately, and without hesitation (Binder, 
1990). In the early stages of B.F. Skinner’s study of human behavior, he identified continuous 
measurement and rate of performance as key metrics with which to study human performance (Skinner, 
1953). In the 1960s, Dr. Ogden Lindsley created the Precision Teaching methodology and its visual 
graphic representational tool, The Standard Celeration Chart (illustrated in Figure 2). Precision Teaching 
adheres to Skinner’s early laboratory findings highlighting the importance of rate as a critical measure of 
human performance and applies these findings in the educational arena (Lindsley, 1972). 



Figure 2. The Standard Celeration Chart Used to Track Frequency-Based Performance 

“Precision Teaching is adjusting the curricula for each learner to maximize learning shown on the 
learner’s personal standard celeration chart. The instruction can be by any method or approach” 
(Lindsley, 1991, p. 259). The system of Precision Teaching (PT) therefore allows educators the freedom to present any 
instruction, specify objectives, state pinpoints, and analyze individual performance with a rate-based 
measure (Binder & Watkins, 1990; Johnson & Layng, 1992, 1994; Johnson & Street, 2004). Skill 
acquisition is observed and compared on an individual basis; students need not be compared with one 
another. Mastery is then defined by both accurate and fluent performance. As such, PT can be further 
used to evaluate the effectiveness of a particular instructional method or approach. 

Within a PT protocol, teachers measure student performance in small increments of time, such as 
one minute. Instructional concepts are broken down into component curricular pieces that combine into 
composite skills. For instance, students may practice building frequency in the component skills of 
number writing and skip counting before learning the composite skill of basic multiplication. Students 
practice these component pieces of instruction to build high rates of correct responding and to reduce 
rates of incorrect responding. Correct and incorrect performance rates are tracked simultaneously on the 
individual student’s Standard Celeration Chart. Consequently, “the chart” provides both snapshots of 
academic skills at one moment in time and learning ability over long spans of time. While it is beyond the 
scope of this paper to describe the features of PT in full, further description is provided by Johnson and Layng 
(1992), Binder (1988), Lindsley (1972), and Pennypacker, Koenig, and Lindsley (1972). 

Many successful applications of using PT to accelerate academic outcomes have been reported in 
special education and general education settings. Some of the earliest research was conducted by Eric 
Haughton, an early PT pioneer. In the area of math education, Haughton (1972, 1980) showed that a 
program of building tool skills such as math facts and number writing to a fluent rate improved 
underachieving students’ math performance to the level of their competent peers. Beck and Clement 
(1991) extended these findings by demonstrating how frequency-based academic interventions can be 
successfully implemented in general education settings. In their study, referred 
to as the Great Falls Precision Teaching Project, public school teachers conducted daily frequency 
building sessions with their students for 20 to 30 minutes across a range of basic skills. Students 
(1) completed two 1-minute timings in two different academic skills, (2) recorded their best timed 
performance, and (3) monitored progress relative to the mastery-based rate criteria needed to advance 
through increasingly complex curriculum objectives. Within three years, students in the school district 
improved between 19 and 44 percentile points on the reading, writing, and math subtests of the Iowa Test 
of Basic Skills. 

As a final example, for the past three decades, Morningside Academy, a private school and 
professional development provider, has employed PT methodologies in its Morningside Model of 
Generative Instruction (Johnson & Layng, 1992, 1994; Johnson & Street, 2004). With procedures that 
focus on building component skills to a fluent rate, this approach typically allows children and youth to 
gain two grade levels per school year. As is common in the initial assessments used within RtI 
frameworks, students are given precision placement tests to determine if a skill is (1) at an instructional 
level, (2) accurate but not fluent, or (3) accurate and fluent. With rigorous instruction based upon sound 
instructional design principles, students move from acquiring skills to achieving fluent levels of 
performance by setting and reaching daily goals targeted on the Standard Celeration Chart. Mathematics 
instruction employs practice in a range of component skills (e.g., digit writing and basic math fact 
computation) and composite skills (e.g., multi-digit multiplication computation and word problem 
solving). 
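
The three placement outcomes named above (instructional level, accurate but not fluent, accurate and fluent) can be sketched as a simple decision rule. The 90% accuracy cutoff and the fluency aim of 60 correct per minute are hypothetical values chosen only for illustration; the source does not report the criteria used at Morningside Academy.

```python
def placement(correct, errors, minutes,
              accuracy_min=0.90,   # hypothetical accuracy criterion
              fluency_aim=60.0):   # hypothetical aim, corrects per minute
    """Classify one timed placement probe into the three categories."""
    attempted = correct + errors
    accuracy = correct / attempted if attempted else 0.0
    rate = correct / minutes
    if accuracy < accuracy_min:
        return "instructional level"
    if rate < fluency_aim:
        return "accurate but not fluent"
    return "accurate and fluent"

# Hypothetical one-minute probes: (corrects, errors, minutes).
for probe in [(20, 10, 1), (30, 1, 1), (65, 2, 1)]:
    print(probe, "->", placement(*probe))
```

Under these assumptions, a skill in the first category would call for instruction, one in the second for frequency building practice, and one in the third for advancement to the next skill.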

Previous RtI Application Studies in Math 

Several studies have examined RtI models developed and implemented for research purposes or 
otherwise described an RtI model already in place to develop student math skills. For instance, Fuchs et 
al. (2004) analyzed the effect of a 16-week Tier 2 math problem solving intervention for third grade 
students (n=301). The TerraNova Achievement Test was used as the universal screening measure to 
determine at-risk status for a math disability (MD), reading disability (RD) or both (MRD). Progress was 
monitored using rate-based probes included in the Monitoring Basic Skills Progress Assessment (MBSP) 
Math Computation and Math Concepts and Applications tests (Fuchs, Hamlett, & Fuchs, 1998). A 
complete description of these measures is provided in the methods section. The intervention consisted of 
problem solving instruction and practice. Controlled comparisons were made to students identified as not 
at-risk who either received or did not receive the treatment. Differential levels of responsiveness were 
observed as a function of risk status, whereby students not at-risk demonstrated the most improvement. 

Bryant et al. (2008) examined the effects of a Tier 2 mathematics intervention with 1st grade 
students (n=42). The Texas Early Mathematics Inventories: Progress Monitoring (TEMI-PM) test scores 
were used as the universal screening measure in Tier 1. Students who scored below the 25th percentile 
were selected for Tier 2 intervention. The 20-minute intervention was delivered four days per week for 
23 weeks. Similar to the Fuchs et al. (2004) study, instruction provided in the Tier 2 intervention was not 
differentiated to address individual academic deficits such as problems with basic number writing, 
subtraction, division, or answering math facts. Instead, instruction for all students emphasized basic 
number skills (e.g., counting, sequencing, and comparing numbers), place value, and 
addition/subtraction combinations. Results showed that posttest scores for the at-risk students were 
significantly higher than expected based on pretest scores which resulted in a main effect for the 
intervention. 

Ardoin and colleagues (2005) implemented a three-phase RtI model with 4th grade students in 
two classes. Universal screening measures used in Tier 1 consisted of curriculum-based measurement 
(CBM) probes in addition, subtraction, and multiplication, which were developed using the website 
www.interventioncentral.com. Students were instructed to answer as many problems as possible within 2 
minutes on each probe. The probes were used as the dependent measures to monitor progress across each 
tier of intervention. The screening data indicated that class-wide deficits existed in certain types of subtraction 
problems (4-digit by 4-digit with regrouping); thus, a class-wide intervention targeting related 
skills was implemented. In order to tailor instruction to instructional need, two more CBM probes were 
implemented breaking down the deficit composite problem into some of its component parts (2-digit by 
2-digit subtraction without regrouping and 2-digit by 2-digit subtraction with regrouping). Results 
indicated that intervention should begin with instruction in 2-digit by 2-digit subtraction without 
regrouping. As an interesting addition to this study, the authors controlled for motivation effects. This 
was done by setting up daily “goals” for the students to exceed their previous scores in repeated practice. 
Upon meeting their daily improvement goal, students would be allowed to select from a variety of 
rewards (e.g., note pads, small toys). If, as a result of the implementation of this motivation system, a 
student’s second score exceeded baseline performance by at least 20% (Noell, Freeland, Witt, & 
Gansle, 2001) and met instructional criteria based upon suggested 4th grade rates of fluency by Shapiro 
(1996) (i.e., 40-88 digits correct in 2 minutes), then that student was considered to have motivational 
deficits (“won’t do”) rather than skill deficits (“can’t do”). For those students whose scores did not 
change within the motivation system, a Tier 2 class-wide intervention was implemented. In Tier 3, more 
intensive instruction was provided to the students (n=5) who did not respond adequately to the class-wide 
intervention. Results revealed that only one student did not respond to Tier 3 intervention. 
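
The “can’t do” versus “won’t do” decision described above can be sketched as follows. The 20% gain and the 40-digit lower bound of the instructional range come from the description above; treating the lower bound as the only threshold (so that scores above 88 also count as meeting criteria) is an assumption of this sketch, as are the function and parameter names.

```python
def classify_deficit(baseline, incentive_score,
                     min_gain=0.20,      # at least 20% above baseline
                     criterion_min=40):  # digits correct in 2 minutes
    """Label a student after the reward condition is introduced."""
    improved = incentive_score >= baseline * (1 + min_gain)
    # Assumption: meeting criteria means reaching the lower bound
    # of the 40-88 digits-correct range.
    meets_criteria = incentive_score >= criterion_min
    if improved and meets_criteria:
        return "won't do"   # motivational deficit
    return "can't do"       # skill deficit: continue to intervention

# Hypothetical 2-minute probe scores: (baseline, score with rewards).
print(classify_deficit(30, 45))
print(classify_deficit(30, 33))
print(classify_deficit(20, 30))
```

The third case improves by well over 20% but still falls short of the instructional range, so it is treated as a skill deficit rather than a motivational one.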


How the Present Study Extends the Literature 

Although each of the studies described above utilized a frequency-based assessment system to 
identify struggling learners and to monitor progress, none provided differentiated instruction to address 
the academic deficits of individual students or utilized mastery-based criteria to facilitate differentiated 
Tier 2 instruction. The purpose of the present investigation is to illustrate how a multi-leveled system of 
assessment combined with PT interventions in mathematics can be incorporated into an RtI framework. 
The goal of this process is to better target individual student weaknesses and thereby further accelerate 
learning outcomes for all students. 


Framework of the Multi-Level Assessment System 

The hallmark feature of the multi-level assessment system is to align daily frequency building 
practice on deficit component skills with weekly standardized fluency probes on composite skills in order 
to monitor and ensure progress on annual normative assessments. To this end, three levels of assessment 
intensity are prescribed: Macro, Meta, and Micro. Figure 3 is designed to help teachers and administrators 
understand the relative importance of each assessment level. These levels of assessment are appropriate 
for any tier of RtI implementation. 



Figure 3. Multi-Level System of Assessment 

Macro level assessment. The “Macro” level includes the utilization of annual standardized 
assessments that are most often norm-referenced, but teachers may also use criterion-referenced 
assessments. The specific types of tests employed are most likely determined by the adopted local, state 
or national requirement. As a result of the No Child Left Behind legislation, states are mandated to report 
scores to the federal government in 4th grade, 8th grade, and high school. Consequently, most states have 
enacted a state-specific assessment program with which to report the adequate yearly progress (AYP) of 
their students. In addition, some school districts administer supplemental tests more regularly in an 
attempt to monitor student progress annually. Examples of widely accepted annual normative assessments 
are the Iowa Test of Basic Skills (ITBS), the Woodcock-Johnson Tests of Achievement-III (WJ-III), the 
Stanford Achievement Test (SAT) and the Wechsler Individual Achievement Test-II (WIAT-II). Teachers 
may not need to administer assessments themselves if annual assessment scores required by the school 
district are available. At the minimum, parallel alternate forms of one assessment protocol should be 
utilized consistently year to year in order to better analyze student progress relative to previous 
performance on the same assessment. 

The Macro level scores serve two main purposes. First, the aforementioned analysis of progress on 
repeated administrations of the same measure illustrates where student performance lies in comparison to 
same-age peers as well as relative to previous performance. The second major use of Macro level scores is 
placement in curricular sequences. For norm-referenced assessments, the grade equivalent scores are used 
to establish a present level of performance for precision placement in state adopted grade-leveled 
curriculum. 

Meta level assessment. The “Meta” level of assessment is defined by the use of weekly 
Curriculum-Based Measurement (CBM) tools that closely align to the curricular content being taught 
daily in the classroom. In mathematics, these assessments may include the measures reviewed in the 
previous RtI application studies. In addition to progress monitoring, CBM measures can be used to set 
annual goals because the procedures allow for three types of referencing: curriculum, normed, and 
individual (Malmquist, 2004). Curriculum referencing occurs because the materials are intended to be 
representative of the classroom’s curriculum. School administrators can thereby compare the scores of 
their own students within and across the district in order to establish local norms for comparison. Finally, 
the student’s own scores can be tracked and analyzed for progress and need for intervention. Since the 
CBM measures are closely aligned with the district’s curriculum, teachers can continually analyze scores 
to see if each student is placed in appropriate grade level material or if a student is ready to advance to the 
next curriculum level. 

The Meta level assessment data can be used to make various intervention decisions. For instance, 
after reviewing data, teachers may decide to provide additional learning opportunities for students to 
engage in frequency building practice on composite skill applications that the student can perform 
accurately, but not fluently. Conversely, the teacher may decide to break down composite skills that the 
student does not perform accurately into their component parts and provide frequency building practice 
opportunities in those component parts. Teachers can then analyze whether the daily practice sessions are 
having an effect on the student’s ability to perform the tasks on the weekly probes. 


Micro level assessment. The “Micro” level of assessment is the most sensitive measurement level. 
It reflects the frequency building work students engage in daily. At this level, students engage in 
deliberate practice of component and composite skills, which are measured as count-per-minute 
increments and recorded on the Standard Celeration Chart. PT interventions are provided when a student 
repeatedly does not reach a daily frequency goal. For example, a student whose written single digit math 
computation rate does not increase as projected may temporarily switch from written practice to oral 
practice. Once adequate oral responding is achieved, the student would resume written practice. 
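
The Micro level rule that an intervention follows when a student “repeatedly” misses a daily frequency goal can be sketched as a check on the most recent timings. The source does not quantify “repeatedly”; the three-consecutive-misses threshold, the function name, and the sample data below are assumptions for illustration only.

```python
def needs_intervention(daily_rates, daily_goals, misses_required=3):
    """True when the most recent `misses_required` timings all fell
    below their daily frequency goals (count per minute)."""
    recent = list(zip(daily_rates, daily_goals))[-misses_required:]
    if len(recent) < misses_required:
        return False  # not enough data yet to decide
    return all(rate < goal for rate, goal in recent)

# Hypothetical corrects per minute and the projected daily goals.
goals = [30, 32, 34, 36]
stalled = [30, 31, 31, 31]
growing = [30, 33, 31, 37]

print(needs_intervention(stalled, goals))
print(needs_intervention(growing, goals))
```

A flagged student might then receive a change such as the temporary switch from written to oral practice described above.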


Instructional Group Placement and Movement Between RtI Tiers 

Homogeneous placement within and across classrooms is the most efficient way to utilize the 
multi-level system of assessment as described in this paper. In many American schools, there may be 
multiple students across classrooms whose skills are significantly behind those of their 
same-age peers. If they are grouped according to their academic skill level, and not their age, then they 
can be monitored and moved through the curriculum accordingly. When school-wide homogeneous 
placement criteria are used, educators should consider the academic level of the student as well as the 
social appropriateness of the placement. In general, a good rule of thumb is to have no more than four 
years of age difference within the group. Therefore, even if an 11-year-old is performing significantly 
below grade level, that student would not be placed with a 6-year-old even if the two students were 
functioning at the same academic skill level. Students who are one to three grade levels behind according 
to their Macro scores are placed in Tier 2, which consists of differentiated instruction and supplemental 
frequency building practice. Students who are four or more grade levels behind are placed in Tier 3, 
which also consists of differentiated instruction and supplemental frequency building practice. However, 
at this level, students receive instruction in smaller groups or on an individual basis with paraeducator 
support. In the current study, supplemental practice provided for students in Tier 2 and Tier 3 occurred in 
the general education classroom during small group instruction with the classroom teacher or a 
paraeducator under the direction of the classroom teacher. 
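
The placement rules in this section can be summarized in a short sketch. The cutoffs (one to three grade levels behind for Tier 2, four or more for Tier 3, and no more than four years of age difference within a group) come from the text; assigning students less than one grade level behind to Tier 1, handling fractional gaps between three and four as Tier 2, and the function names are assumptions of this sketch.

```python
def assign_tier(enrolled_grade, macro_grade_equivalent):
    """Tier placement from the gap between enrolled grade and the
    grade-equivalent score on the Macro level assessment."""
    gap = enrolled_grade - macro_grade_equivalent
    if gap >= 4:
        return 3
    if gap >= 1:
        # Assumption: gaps between 3 and 4 grade levels stay in Tier 2.
        return 2
    return 1

def group_is_age_appropriate(ages, max_span=4):
    """Rule of thumb: no more than four years of age difference."""
    return max(ages) - min(ages) <= max_span

print(assign_tier(6, 2))                      # four grade levels behind
print(assign_tier(4, 2.5))                    # 1.5 grade levels behind
print(assign_tier(4, 4.2))                    # at or above grade level
print(group_is_age_appropriate([11, 6]))      # five-year span
print(group_is_age_appropriate([11, 8, 7]))   # four-year span
```

The age check is kept separate from the tier rule because, as described above, two students may function at the same academic level and still be an inappropriate social grouping.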


All students are monitored using Meta level assessment tools, regardless of the tier in which the 
student is placed. In contrast to most RtI models, all students in the current study were assessed with the 
same frequency within each tier. The difficulty level of assessment used was matched to the grade level of 
the curriculum placement. For instance, a 6th grade student who requires 2nd grade level instruction 
according to Macro level assessment data was assessed using 2nd grade level CBM materials. These Meta 
level assessments are used until the student’s performance indicates readiness to move up in grade level 
according to the exit criterion embedded in the CBM protocol. Once students in Tier 3 close the gap to 
three or fewer grade levels, they are placed in Tier 2. Likewise, once students in Tier 2 reach the grade 
level material matched to their age, they move to Tier 1. In this way, the RtI framework used in 
conjunction with the multi-level system of assessment assures that even those students who are initially 
significantly behind their same-age peers will be afforded the possibility of “catching up” to their peers. 


Application of the Multi-Level System of Assessment in Mathematics 
Participants & Setting 

To illustrate the multi-level system of assessment, the mathematics progress of two 4th grade 
students placed in Tier 2 intervention was examined in the present study. Both students attended a small 
private school in Seattle, WA where students with and without disabilities are educated side-by-side in an 
inclusion model. The specialized focus of the school is to educate students on the autism spectrum 
alongside their typically developing peers. 

At the time of her entrance into the program, Sarah was 11 years old and was falling behind her 
same-age peers in math. Prior to 4th grade, Sarah was evaluated by private professionals in order to 
determine if she had a diagnosable condition. By all accounts, her test results did not indicate any learning 
disabilities. However, according to the public school records from her previous elementary school, she 
was qualified for special education services and given an Individualized Education Plan (IEP) under the 
eligibility category of Specific Learning Disability, which allowed her to access extra support in math. At 
the time, her public school district was not using the Rtl model to identify and serve struggling learners 
within the general education classroom. Consequently, the only option the teachers had to provide 
additional support for Sarah was through the IEP system. Sarah entered 4 th grade as a shy student 
displaying very little confidence in her math skills, however she excelled in every other academic subject 
where she tested at or above grade level. She displayed extraordinary empathy and willingness to support 
the other students and, despite her initial shyness, she developed some very impressive leadership skills. 


Mason joined the program as a sweet-natured, shy 10-year-old boy whose previous educational
experience consisted of a combination of home schooling with his mother and another private school
placement for three days per week. Mason’s diagnosis placed him on the higher functioning end of the
autism spectrum. He was compliant, displayed no behavioral difficulties, and was eager to please adults. 
He had marked deficits in higher level language processing and reasoning skills which manifested 
themselves in difficulty with reading comprehension, oral language, and social interaction with peers. 
Upon entrance into the program, his math calculation and concrete thinking skills were in the average 
range overall for his grade level. However, his math fluency score was at least one grade level behind his 
peers. Moreover, his higher level thinking and language processing deficits were already impacting his 
ability to successfully complete math reasoning and word problem examples and he was becoming 
increasingly anxious when completing these tasks during math class. 


Due to space limitations, only Sarah’s educational program will be detailed. However, it should 
be noted that the instructional methods and decision-making processes were identical for both Sarah and 
Mason. The purpose of including the results of both students is to show how the same multi-level system 
of assessment integrated within an RtI framework can be applied to improve the academic outcomes of
diverse students. 


Measures 

Several Macro, Meta, and Micro assessments were employed in the current investigation. The 
assessments used within each level are diagrammed in Figure 4. Note that the model for impacting Math
Fluency subtest scores is distinguished from the model for impacting Math Applied Problems, Concepts,
and Calculation subtest scores.



[Figure 4 panels: General Model; Model for Impacting Math Fluency Subtest Scores; Model for Impacting
Math Applied Problems, Concepts, and Calculation Subtest Scores]

Figure 4. Multi-Level System of Assessment for Impacting Math Subtest Scores


At the Macro level, four subtests of the Woodcock-Johnson Tests of Achievement-III (WJ-III;
Woodcock & Johnson, 1989) were administered at the beginning of the school year to facilitate placement
in the math curriculum. The Calculation subtest measures the ability to perform mathematical
computations in addition, subtraction, multiplication, division, geometry, and trigonometry, as well as
logarithmic and calculus operations. The Math Fluency subtest assesses the ability to solve simple
addition, subtraction, and multiplication facts correctly within three minutes. The Applied Problems
subtest requires the subject to analyze and solve math problems across a variety of areas, such as time,
money, word problems, fractions, decimals, and percentages. Finally, the Quantitative Concepts
subtest measures knowledge of mathematical concepts, symbols, and vocabulary. Progress at this
level was measured at the beginning and end of the school year.

At the Meta level, the Computation CBM test and the Concepts and Applications CBM test from
the Monitoring Basic Skills Progress (MBSP) assessment were used (Fuchs, Hamlett & Fuchs, 1998). The
Computation test measures student ability to complete grade level whole number computation problems
in a predetermined amount of time; the time limit varies with the level of CBM administered. The Concepts
and Applications test measures student ability to complete grade specific mathematics concepts in addition to
whole number computation, such as word problems, measurement, fractions, decimals, money, and other
concepts depending upon grade level. Like the Computation test, this assessment is timed according to the
CBM grade level administered. The student’s score is calculated by counting the number of
responses written by the end of the prescribed time period. The score is then compared to a norm-referenced
table indicating the student’s rate of performance relative to the distribution of students in the
normative sample. Accordingly, the decision rules used within the multi-level system of assessment
specify that students who score at or above the 75th percentile are moved up to the next grade
level of both Saxon math instruction and the accompanying grade equivalent CBM measures. Teachers
typically share the value of this analysis with the student. Students like Sarah, whose skills are well below
those of same-age peers, are afforded the possibility of catching up to peers because they are not required
to stay in one level of curriculum for the entire academic year. Progress at this level was measured weekly
using the Standard Celeration Chart.
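
The Meta level decision rule can be expressed as a small sketch. It assumes that the percentile criterion means scoring at or above the 75th percentile of the normative sample, and it treats the percentile as already looked up from the norm-referenced table; the function and parameter names are hypothetical and are not part of the MBSP materials.

```python
def next_curriculum_level(current_level: int, percentile: float,
                          exit_percentile: float = 75.0) -> int:
    """Return the grade level of curriculum and CBM probes to use next.

    current_level: grade level of the current Saxon/CBM placement.
    percentile: the student's standing on the weekly probe, read from the
        norm-referenced table (the lookup itself is not modeled here).
    exit_percentile: assumed reading of the exit criterion in the text.
    """
    if percentile >= exit_percentile:
        return current_level + 1  # move up in both instruction and CBM level
    return current_level          # stay and continue weekly monitoring


print(next_curriculum_level(3, 80.0))  # meets the criterion
print(next_curriculum_level(3, 60.0))  # does not yet meet the criterion
```

Because the rule is checked against weekly data, a student can change levels at any point in the year rather than only at its end.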

At the Micro level, students built frequency on digit writing, basic math fact computation, and 
concepts and application in third grade Saxon math. Typical practice sessions require approximately 10 
minutes of frequency building practice per Micro level activity. Progress at this level was monitored daily 
using the Standard Celeration Chart. 

Placement and Intervention 

The pretest Macro level assessment results are provided in Table 1. The data show that Sarah’s 
pretest grade equivalent scores ranged from approximately half a year behind in Applied Problems to 
approximately 2.5 grade levels behind in Quantitative Concepts. 

Table 1. Macro Level Pretest Scores Used for Placement

Student   Grade   Eligibility Category   WJ-III Subtest Title     Pretest GE Score
Sarah     4       Learning Disability    Calculation              3.0
Sarah     4       Learning Disability    Math Fluency             2.2
Sarah     4       Learning Disability    Applied Problems         3.5
Sarah     4       Learning Disability    Quantitative Concepts    1.4
Mason     4       Autism                 Calculation              4.4
Mason     4       Autism                 Math Fluency             3.0
Mason     4       Autism                 Applied Problems         4.2
Mason     4       Autism                 Quantitative Concepts    4.2

Note. WJ-III = Woodcock-Johnson III Achievement Test; GE = Grade Equivalent.

Since Sarah scored one to three grade levels behind on the Macro level assessments, she received
Tier 2 intervention. Her teachers chose the curriculum grade level that aligned most closely to her
performance level on the WJ-III assessment: Saxon 3 math (Larson et al., 1994). This was an approved
public school curriculum in the state of Washington where she resided. In addition, the teachers
established her baseline performance on the 3rd grade level weekly curriculum-based measurement probes
for Computation and Concepts and Applications using the Monitoring Basic Skills Progress (MBSP)
assessment (Fuchs, Hamlett & Fuchs, 1998). As per the protocol in the administration manual, Sarah was
given three minutes to complete the 3rd grade Computation assessment and six minutes to complete the 3rd
grade Concepts and Applications assessment.
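
As a worked check of the placement reasoning, the following sketch computes how far each of Sarah’s Table 1 pretest scores falls below her grade placement. It assumes a grade placement of 4.0 at the start of 4th grade and reads the decimal digit of a grade equivalent as a tenth of a school year; both are assumptions made for illustration, and the function names are hypothetical.

```python
# Sarah's pretest grade equivalent (GE) scores from Table 1.
SARAH_PRETEST = {
    "Calculation": "3.0",
    "Math Fluency": "2.2",
    "Applied Problems": "3.5",
    "Quantitative Concepts": "1.4",
}


def ge_to_tenths(ge: str) -> int:
    """Convert a GE string such as '2.2' to tenths of a grade level (22)."""
    whole, tenth = ge.split(".")
    return int(whole) * 10 + int(tenth)


def grade_levels_behind(ge_score: str, grade_placement: str = "4.0") -> float:
    """Gap between assumed grade placement and the GE score, in grade levels."""
    return (ge_to_tenths(grade_placement) - ge_to_tenths(ge_score)) / 10


gaps = {name: grade_levels_behind(score) for name, score in SARAH_PRETEST.items()}
for name, gap in gaps.items():
    print(f"{name}: {gap} grade levels behind")
```

Under these assumptions the gaps run from 0.5 (Applied Problems) to 2.6 (Quantitative Concepts), which matches the range described in the text.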

As the school year progressed, Sarah and her teachers analyzed her CBM data as well as her 
progress within the Saxon curriculum. If the data indicated that Sarah needed additional practice in 
concepts due to error patterns in her lessons or in the weekly CBM probe, then Sarah resumed daily
practice on those skills at the Micro level of assessment utilizing a Precision Teaching approach. The 
pinpoints for the daily practice targeted either component or composite skills. In Sarah’s case, she needed 
additional practice in both types of skills. For the component skills, Sarah engaged in the daily frequency 
building practice of writing her numbers fluently so that she could write more quickly and thus answer 
more questions on the weekly computation probe. The teacher noted that she was solving computation 
problems accurately, but she hesitated on simple facts, which resulted in her answering fewer questions
during the time allotted in the weekly CBM probe. Therefore, she began to practice building her 
frequency of responding on single digit math facts for all operations (addition, subtraction, multiplication, 
division). For the composite skills, Sarah was generally accurate in solving most concepts taught directly 
from the Saxon curriculum, but was very slow and frequently got “lost” in processes. This meant that
during her daily math class, Sarah would build frequency in her responses on whole number computation
problems which aligned to the type and difficulty of problems taught in the Saxon lessons on that day. In
addition, Sarah was given what the teacher affectionately titled a “Saxon Dump Chart.” The daily
practice was analyzed on a Standard Celeration Chart. The instruction shifted between any and all 
concepts taught in the curriculum and assessed on the weekly CBM probe for those areas in which Sarah 
needed more frequency building practice. The Micro and Meta level assessments showed she was not able 
to perform these concepts correctly either in the daily Saxon lesson (Micro) or in her weekly (Meta) 
probe. For example, during one CBM probe on Concepts and Applications, Sarah either skipped or made 
errors in all items dealing with reducing fractions. This resulted in a change of instruction in math class 
for the next week whereby the teacher provided deliberate timed practice examples on reducing fractions, 
which aligned with the concept as it was taught in Sarah’s daily Saxon lessons. The data from these timed 
practices were graphed on her “Saxon Dump Chart.” As soon as the next CBM measure yielded scores 
that showed Sarah correctly answered all reducing fraction problems, the teacher then moved on to
another concept in which Sarah needed more help. The subsequent concept was targeted next on the 
“Saxon Dump Chart” in daily practice. This process of monitoring performance on the Standard 
Celeration Chart and examining CBM data continued throughout the year. 
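
The weekly cycle described above (review the probe, target the concepts that were missed, and move on once a later probe shows the concept answered correctly) can be sketched as follows. The data format, the function names, and the choice to list the most-missed concept first are all assumptions made for illustration; the paper does not specify how items were recorded or prioritized.

```python
from collections import defaultdict


def concepts_to_target(probe_items):
    """Return concepts with at least one skipped or incorrect item.

    probe_items: list of (concept, outcome) pairs, where outcome is
    'correct', 'error', or 'skipped' (a hypothetical recording format).
    Concepts are ordered most-missed first, an assumed tie-breaking choice.
    """
    misses = defaultdict(int)
    for concept, outcome in probe_items:
        if outcome != "correct":
            misses[concept] += 1
    return sorted(misses, key=lambda c: (-misses[c], c))


def is_mastered(probe_items, concept):
    """True when every item for the concept on this probe was correct."""
    outcomes = [o for c, o in probe_items if c == concept]
    return bool(outcomes) and all(o == "correct" for o in outcomes)


week_1 = [("reducing fractions", "skipped"), ("reducing fractions", "error"),
          ("money", "correct"), ("measurement", "correct")]
week_2 = [("reducing fractions", "correct"), ("reducing fractions", "correct"),
          ("money", "correct"), ("measurement", "error")]

print(concepts_to_target(week_1))                 # concept for next week's timed practice
print(is_mastered(week_2, "reducing fractions"))  # ready to move on?
print(concepts_to_target(week_2))                 # next concept to target
```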

Results 

The pretest and posttest Macro level test scores for Sarah and Mason are provided in Table 2. The
outcomes illustrate the effectiveness of integrating frequency-based instruction within a multi-level RtI
system.


Table 2. Macro Level Pretest and Posttest Scores Used for Program Evaluation

Student   Grade   Eligibility Category   WJ-III Subtest Title     Pretest GE Score   Posttest GE Score   Pre-Post Test Gain in 10 Months
Sarah     4       Learning Disability    Calculation              3.0                3.4                 +4 months
Sarah     4       Learning Disability    Math Fluency             2.2                4.8                 +2 years, 6 months
Sarah     4       Learning Disability    Applied Problems         3.5                5.6                 +2 years, 1 month
Sarah     4       Learning Disability    Quantitative Concepts    1.4                5.5                 +4 years, 1 month
Mason     4       Autism                 Calculation              4.4                5.4                 +1 year
Mason     4       Autism                 Math Fluency             3.0                9.8                 +6 years, 8 months
Mason     4       Autism                 Applied Problems         4.2                5.2                 +1 year
Mason     4       Autism                 Quantitative Concepts    4.2                6.3                 +2 years, 1 month

Note. WJ-III = Woodcock-Johnson III Achievement Test; GE = Grade Equivalent.

Both students made significant improvement over the course of the academic year. Specifically,
Sarah gained more than two grade levels in one academic year on the Math Fluency and Applied
Problems subtests. On the Quantitative Concepts subtest, she gained over four grade levels in one
academic year. At the end of 4th grade, Sarah had caught up to the average range of her peers in these skill
areas. However, on the Calculation subtest, the gap between her skills and those of her peers had widened:
she was 1.0 grade level behind at the beginning of the year and 1.6 grade levels behind by the
end of the year. This downward trend was addressed intensively in her 5th grade year. Similarly, Mason’s
posttest scores showed impressive gains. Mason gained one grade level on two subtests (Calculation and
Applied Problems). Meanwhile, he gained just over two grade levels on the Quantitative Concepts subtest
while advancing approximately 6.8 grade levels on the Math Fluency subtest.
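
The gains reported in Table 2 can be re-derived from the pretest and posttest grade equivalents. The sketch below assumes that the decimal digit of a grade equivalent counts tenths of a school year, which Table 2 reports as months; the function names are hypothetical, and the formatting handles only non-negative gains.

```python
# (pretest GE, posttest GE, gain as reported in Table 2)
TABLE_2 = {
    ("Sarah", "Calculation"): ("3.0", "3.4", "+4 months"),
    ("Sarah", "Math Fluency"): ("2.2", "4.8", "+2 years, 6 months"),
    ("Sarah", "Applied Problems"): ("3.5", "5.6", "+2 years, 1 month"),
    ("Sarah", "Quantitative Concepts"): ("1.4", "5.5", "+4 years, 1 month"),
    ("Mason", "Calculation"): ("4.4", "5.4", "+1 year"),
    ("Mason", "Math Fluency"): ("3.0", "9.8", "+6 years, 8 months"),
    ("Mason", "Applied Problems"): ("4.2", "5.2", "+1 year"),
    ("Mason", "Quantitative Concepts"): ("4.2", "6.3", "+2 years, 1 month"),
}


def ge_to_tenths(ge: str) -> int:
    """Convert a GE string such as '4.8' to tenths of a grade level (48)."""
    whole, tenth = ge.split(".")
    return int(whole) * 10 + int(tenth)


def format_gain(pre: str, post: str) -> str:
    """Format a non-negative GE gain in the 'years, months' style of Table 2."""
    years, months = divmod(ge_to_tenths(post) - ge_to_tenths(pre), 10)
    parts = []
    if years:
        parts.append(f"{years} year" + ("" if years == 1 else "s"))
    if months:
        parts.append(f"{months} month" + ("" if months == 1 else "s"))
    return "+" + ", ".join(parts or ["0 months"])


for (student, subtest), (pre, post, reported) in TABLE_2.items():
    computed = format_gain(pre, post)
    print(f"{student:5} {subtest:21} {pre} -> {post}  {computed}  (reported: {reported})")
```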

Discussion 

We described a model for integrating frequency-based instruction with a multi-level assessment
system to enhance RtI frameworks in order to improve mathematics outcomes. The assessment system
includes Macro, Meta, and Micro level measures. The Macro level measures were used to place students 
in the mathematics curriculum using the pretest scores and to evaluate the overall effectiveness of the 
instructional program based on posttest scores. The Meta level measures were used to monitor weekly 
learning progress and guide instruction at the Micro level. The Micro level frequency-based practice and 
assessment were, in turn, used to drive progress at the Meta and Macro levels. 

The results revealed that both of the highlighted 4th grade students made significant mathematics
progress over the course of the Tier 2 intervention. Of particular interest, despite having qualified for
special education services under the eligibility category of Specific Learning Disability in the public
school system, Sarah gained over four grade levels in quantitative reasoning skills and over two grade
levels in applied reasoning and math fluency skills. These gains occurred in only ten months between her
pre and post assessments and while she was placed in a small group general education class. She did not
require the additional pull-out special education services assigned to her through her IEP. We believe that
these outcomes challenge the legitimacy of using the dual discrepancy model to qualify students for
special education services because, with high quality instruction and on-going progress
monitoring within her general education classroom, Sarah no longer needed an individualized education
plan.


Additionally, the student diagnosed with autism made dramatic progress in one academic year as
well. Mason’s scores on the Applied Problems and Calculation subtests maintained their normative status
compared to his peers, as evidenced by a one year gain in each skill area, and his performance on the
Quantitative Concepts test increased by more than two grade levels. However, Mason’s gains on his
Math Fluency scores dramatically highlight the power of the multi-level system of assessment. His
performance yielded a score similar to that of a student almost 5 grade levels ahead of him, a gain of over 6
grade levels! We attribute this improvement to daily frequency-based practice on basic math facts
using PT instructional design and delivery methods, which required approximately 10 minutes of his class
time each day. It is especially noteworthy that this multi-level system compares Mason’s scores to those
of non-diagnosed students of the same grade level at both the Macro level (WJ-III) and the Meta level
(MBSP). Here is a student with a diagnosis marked by developmental delays, monitored by general
education standards within a general education class structure, and the results show that he caught up to
his peers or surpassed them in certain areas. Policy makers, educators, and behavior analysts who work
within and in conjunction with the field of autism intervention should take note of Mason’s case. Further
research into the efficacy of holding students with autism to the same academic standards as their
neurotypical peers, which employs an RtI model using PT methods within a multi-level system of progress
monitoring, is an exciting prospect.

In addition to the field of autism, more rigorous research is needed to determine the extent to
which such outcomes can be consistently achieved by students labeled with mild to moderate learning
problems given individualized frequency-based mathematics instruction using PT methods delivered
within a multi-level assessment system. It is possible that if the same curriculum were used without the
frequency building practice, the results would differ. Alternatively, it is possible that if the same
curriculum and instructional methods were used, but assessment at the Meta level was less frequent, the
results would not differ. Consequently, more research is needed to determine the essential instructional
and assessment ingredients that rendered this particular recipe so successful.

The assumptions made about student success or failure are being challenged by a system that
currently provides special education services to approximately 6.8 million children and youth served by
law under IDEA (DOE, 2010). According to the United States Department of Labor Bureau of Labor
Statistics, in order to serve these students, the U.S. employed 473,000 special education teachers (as of
2008) with an anticipated increase of 17 percent from 2008 to 2018, making special education teachers
the most rapidly growing occupation. When federal funds are added to state and local funding, an
estimate from nearly ten years ago stated “$35-$60 billion is spent annually on special education in this
country, with possibly 40 percent of all new spending on K-12 education over the past 30 years spent on
special education” (Finn, Rotherham, & Hokanson, 2001). Many school districts are adopting the RtI
framework as a way to more easily satisfy the No Child Left Behind mandate to maintain current funding
levels and to also be competitive for the new federally funded Race to the Top initiative (2009) with
grants that range from $25 million to $700 million. These facts and figures may be daunting, yet educators and
psychologists who stand prepared in the classrooms with effective assessment tools and methodologies,
and who have the commitment to serve all children, can view this challenge with optimism. One such
proclamation is that RtI stands for “Really Terrific Instruction” (Tilly, 2008).

Given the growing acceptance and enthusiasm for RtI in the general education realm, and the fact
that IDEA (2004) requires that schools must have procedures in place such “that special classes, separate
schooling, or other removal of children with disabilities from the regular educational environment occurs
only when the nature or severity of the disability is such that education in regular classes with the use of
supplementary aids and services cannot be achieved satisfactorily,” RtI provides yet another opportunity
for nationwide educational improvements for both general and special education students. The least
restrictive environment for a student previously identified as in need of special education services may
one day be our general education classrooms, where effective instruction and regular
formative assessment will be present for all learners. The goal of the system described in this paper is
highly effective and inclusive education for all students in the same general education classroom.

More than ever before, the push for bringing data-based decision making into the hands of
teachers directly impacting their students on a daily basis opens the door for a powerful partnership
between the fields of education and Applied Behavior Analysis. Interestingly enough, behavior analysts
might recognize how our contributions in education have been aligned with the goals of RtI for decades.
Those who are familiar with formative evaluation will recognize its connection to the history of
Responsiveness to Intervention and its use of formative assessment. In 1989, Susan Markle wrote
an article entitled "The Ancient History of Formative Evaluation." In it she traces the work of behavior
analysts from Skinner's work in the laboratory using shaping procedures with pigeons to the complex
design of programmed instruction, and she reminded us over twenty years ago that

"The process now called 'formative evaluation' is an old idea with many names. In
those early days, we called it 'developmental testing' or simply 'tryout.' Evaluation
experts (Lumsdaine, 1965; Scriven, 1967) clarified our ideas by distinguishing
between 'formative' and 'summative' evaluation" (p. 27).

In RtI protocols, it is the teacher who is now responsible for examining and shaping the
responsiveness to instruction. What was once the responsibility assumed by the instructional designer
of programmed instruction (at the Micro level) is now required of both the instructional material
(evidence-based instruction) and the instructional leaders, including the principals and teachers. The
curriculum and its delivery system stand at the forefront of the system’s accountability and greatly
influence whether a student will be identified as eligible for special education services. In RtI, we
might consider all the variables that contribute to efficient and effective learning, and refer to them as
“the program.” According to Markle (1969), when progress does not occur, “if the student errs, the
programmer flunks” (p. 16).

Another article written by a behavioral scientist nearly 50 years ago discusses “Dimensions of the
Need,” describing the very same variables that can be teased out in an RtI model. In his discussion of
programmed instruction, Padwa states that “the behavioral development and testing of
instructional programs assure an unprecedented appropriateness of teaching methods to the backgrounds
and actual preparation levels of the students taught, as well as a precise evaluation of its own
effectiveness.” Padwa’s success with programmed instruction led him to affirm that programmed
instruction had the “ability to guarantee high achievement” (Padwa, 1962, emphasis in original). These
same principles of sound instructional design can once again be studied and brought to the classroom by
behavior analysts and teachers for their students.

The power of using a multi-level system of assessment as a formative decision making tool for 
students and teachers is the alignment of curriculum to assessment and on-going progress monitoring in a 
general education classroom for all students, regardless of diagnoses or ability levels. Although our 
investigation briefly describes a system with case studies in mathematics, the procedures can be extended 
to the typical public school classroom when all resources are optimally employed. This assessment 
system can be utilized for any academic subject area where there exist normative standards for mastery. 
Creating a powerful alliance of the expertise of those who know how to teach (teachers) with those who 
know how to analyze (behavior analysts) is an exciting prospect which will likely result in a substantially
positive impact for future generations of students with and without disabilities.